28th Large Installation System Administration Conference (LISA 14)
Authors
Abstract
A common way for a virtual machine cluster (VMC) to tolerate failures is to create a distributed snapshot and then restore from that snapshot upon failure. However, restoring the whole VMC suffers from long restore latency due to large snapshot files. Moreover, differing restore latencies lead to discrepancies in start time among the virtual machines. A virtual machine (VM) that starts earlier therefore cannot communicate with a VM that is still restoring, which leads to the TCP backoff problem. In this paper, we present a novel restore approach called HotRestore, which restores the VMC rapidly without compromising performance. First, HotRestore restores a single VM through an elastic working set, which prefetches the working set with a scalable window size, thereby reducing the restore latency. Second, HotRestore constructs the communication-induced restore dependency graph and then schedules the restore line to mitigate the TCP backoff problem. Lastly, a restore protocol is proposed to minimize the backoff duration. A prototype has been implemented on QEMU/KVM. The experimental results demonstrate that HotRestore can restore the VMC within a few seconds while reducing the TCP backoff duration to merely dozens of milliseconds.
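To illustrate why a gap in start times can translate into a long stall, the following is a minimal sketch of exponential retransmission backoff. It is not taken from the paper: the function name `resume_delay`, the 1-second initial timeout, and the 60-second cap are assumptions chosen for illustration (real TCP stacks derive the timeout from measured round-trip times and use their own bounds). The model assumes a sender whose segment goes unanswered at time 0 and who retransmits with a doubling timeout until the peer VM has finished restoring.

```python
def resume_delay(peer_ready_at, initial_rto=1.0, max_rto=60.0):
    """Simplified model of TCP exponential backoff (illustrative only).

    peer_ready_at: seconds after the first lost segment at which the
                   peer VM finishes restoring (hypothetical input).
    Returns (time of the first retransmission that reaches the peer,
             idle time between the peer becoming ready and that retry).
    """
    t = 0.0            # time of the most recent (unanswered) transmission
    rto = initial_rto  # current retransmission timeout
    while True:
        t += rto                    # next retransmission fires after rto
        if t >= peer_ready_at:      # peer is up, so this retry succeeds
            return t, t - peer_ready_at
        rto = min(rto * 2, max_rto) # timeout doubles after each failure


if __name__ == "__main__":
    for gap in (0.5, 3.0, 8.0):
        resumed, idle = resume_delay(gap)
        print(f"peer ready at {gap:4.1f}s: resumes at {resumed:4.1f}s, idle {idle:4.1f}s")
```

Under these assumptions, a peer that becomes ready 8 seconds late is not contacted again until the retry at 15 seconds, so the connection sits idle for a further 7 seconds even though both VMs are running. This is the kind of stall that shrinking the start-time gap is meant to avoid.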
Similar resources
USENIX Association Proceedings of the 14th Systems Administration Conference (LISA 2000) New Orleans
Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein.
Simplifying System Administration Tasks: The UAMS Approach
The User Account Management System (UAMS) is an extension of the original User DataBase (UDB) system presented at the USENIX Large Installation System Administration Conference in 1990. This paper describes the extensions of the UDB system from a single administrative entity tool for a distributed set of computers to a multidepartmental system over the period of three years since the first pape...
USENIX Association Proceedings of the 17th Large Installation Systems Administration Conference
While some work has discussed hiring system administrators, and other work has focused on the technical and mechanical requirements for terminating a system administrator, there has been very little published regarding how to review or evaluate a system administrator. This paper presents one approach to doing such a review, followed by scenarios that explore the approach. The system developed i...
USENIX Association November 9–14, 2014 Seattle, WA Proceedings of the 28th Large Installation System Administration Conference (LISA 14)
A common way for a virtual machine cluster (VMC) to tolerate failures is to create a distributed snapshot and then restore from that snapshot upon failure. However, restoring the whole VMC suffers from long restore latency due to large snapshot files. Moreover, differing restore latencies lead to discrepancies in start time among the virtual machines. A virtual machine (VM) that starts earlier therefore cannot ...
A simple installation and administration tool for the large-scaled PC cluster system: DCAST
In this paper, a new setup/administration tool for PC cluster systems is proposed. Recently, in the high performance computing field, PC cluster systems are becoming popular. PC cluster systems consist of PCs connected via a network and are used for parallel and distributed computing. PC cluster systems achieve a good cost to performance ratio by using commodity hardware to construct the cluster....
USENIX Association Proceedings of LISA 2002: 16th Systems Administration Conference
Quotidian system administration is often characterized by the fulfillment of common user requests, especially on sites that serve a variety of needs. User creation, group management, and mail alias maintenance are just three examples of the many repetitive tasks that can crowd the sysadmin's day. Matters worsen when users neglect to provide necessary information for the job. They can grow bleak...
Journal title:
Volume, issue:
Pages: -
Publication date: 2014